
    Discriminative variable selection for clustering with the sparse Fisher-EM algorithm

    The interest in variable selection for clustering has increased recently due to the growing need to cluster high-dimensional data. Variable selection in particular eases both the clustering and the interpretation of the results. Existing approaches have demonstrated the efficiency of variable selection for clustering but turn out to be either very time consuming or not sparse enough in high-dimensional spaces. This work proposes to select the discriminative variables by introducing sparsity in the loading matrix of the Fisher-EM algorithm. This clustering method was recently proposed for the simultaneous visualization and clustering of high-dimensional data. It is based on a latent mixture model which fits the data into a low-dimensional discriminative subspace. Three different approaches are proposed in this work to introduce sparsity in the orientation matrix of the discriminative subspace through $\ell_1$-type penalizations. Experimental comparisons with existing approaches on simulated and real-world data sets demonstrate the interest of the proposed methodology. An application to the segmentation of hyperspectral images of the planet Mars is also presented.
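
    As a minimal illustration of the kind of $\ell_1$-type sparsification described above (not the exact penalized F-step of the paper), the following Python sketch soft-thresholds the entries of a loading matrix and then re-orthonormalizes its columns; the threshold value and the toy data are assumptions.

```python
import numpy as np

def soft_threshold(U, lam):
    """Entry-wise soft-thresholding: shrinks small loadings exactly to zero."""
    return np.sign(U) * np.maximum(np.abs(U) - lam, 0.0)

def sparsify_loadings(U, lam):
    """Soft-threshold a loading matrix, then re-orthonormalize its columns
    with a thin QR decomposition so that it still spans a subspace."""
    U_sparse = soft_threshold(U, lam)
    Q, _ = np.linalg.qr(U_sparse)
    return U_sparse, Q

# Toy example: a random p x d orientation matrix (p variables, d latent axes).
rng = np.random.default_rng(0)
U = np.linalg.qr(rng.normal(size=(20, 3)))[0]
U_sparse, U_ortho = sparsify_loadings(U, lam=0.15)
print("zero loadings:", np.sum(U_sparse == 0), "out of", U_sparse.size)
```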

    Generative classification of high-dimensional data: state of the art and recent advances

    In recent years, generative classification has had to cope with the increase in data dimensionality and the associated curse of dimensionality.

    The discriminative functional mixture model for a comparative analysis of bike sharing systems

    Bike sharing systems (BSSs) have become a means of sustainable intermodal transport and are now proposed in many cities worldwide. Most BSSs also provide open access to their data, particularly to real-time status reports on their bike stations. The analysis of the mass of data generated by such systems is of particular interest to BSS providers to update system structures and policies. This work was motivated by interest in analyzing and comparing several European BSSs to identify common operating patterns in BSSs and to propose practical solutions to avoid potential issues. Our approach relies on the identification of common patterns between and within systems. To this end, a model-based clustering method, called FunFEM, for time series (or more generally functional data) is developed. It is based on a functional mixture model that allows the clustering of the data in a discriminative functional subspace. In this context, the model has the advantage of being parsimonious and of allowing the visualization of the clustered systems. Numerical experiments confirm the good behavior of FunFEM, particularly compared to state-of-the-art methods. The application of FunFEM to BSS data from JCDecaux and the Transport for London Initiative allows us to identify 10 general patterns, including pathological ones, and to propose practical improvement strategies based on the system comparison. The visualization of the clustered data within the discriminative subspace turns out to be particularly informative regarding the system efficiency. The proposed methodology is implemented in a package for the R software, named funFEM, which is available on the CRAN. The package also provides a subset of the data analyzed in this work. Published at http://dx.doi.org/10.1214/15-AOAS861 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
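
    The FunFEM algorithm itself is available in the funFEM package on CRAN; as a hedged illustration of the general pipeline only (basis expansion of the raw time series, then model-based clustering of the coefficients), the Python sketch below uses a Fourier basis and a plain Gaussian mixture, both of which are assumptions and do not reproduce the discriminative-subspace model of the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fourier_design(t, n_basis=5):
    """Fourier design matrix on a common time grid t in [0, 1]."""
    cols = [np.ones_like(t)]
    for k in range(1, n_basis + 1):
        cols += [np.sin(2 * np.pi * k * t), np.cos(2 * np.pi * k * t)]
    return np.column_stack(cols)

# Toy data: 60 stations x 48 time points (e.g. hourly availability profiles).
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 48)
curves = np.vstack(
    [np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=48) for _ in range(30)]
    + [np.cos(2 * np.pi * t) + 0.1 * rng.normal(size=48) for _ in range(30)])

# Step 1: basis expansion -- least-squares basis coefficients for each curve.
B = fourier_design(t)
coefs = np.linalg.lstsq(B, curves.T, rcond=None)[0].T

# Step 2: model-based clustering of the basis coefficients.
labels = GaussianMixture(n_components=2, random_state=0).fit_predict(coefs)
print(labels)
```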

    Kernel discriminant analysis and clustering with parsimonious Gaussian process models

    This work presents a family of parsimonious Gaussian process models which allow one to build, from a finite sample, a model-based classifier in an infinite-dimensional space. The proposed parsimonious models are obtained by constraining the eigen-decomposition of the Gaussian processes modeling each class. This in particular allows the use of non-linear mapping functions which project the observations into infinite-dimensional spaces. It is also demonstrated that the classifier can be built directly from the observation space through a kernel function. The proposed classification method is thus able to classify data of various types, such as categorical data, functional data or networks. Furthermore, it is possible to classify mixed data by combining different kernels. The methodology is also extended to the unsupervised classification case. Experimental results on various data sets demonstrate the effectiveness of the proposed method.
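
    The parsimonious Gaussian process classifier itself is not sketched here; the snippet below only illustrates the kernel-combination idea mentioned for mixed data, by summing an RBF kernel on the numeric columns and a simple matching kernel on the categorical columns and feeding the result to a precomputed-kernel SVM. The kernels, bandwidth and classifier choice are assumptions, not the method of the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

def matching_kernel(A, B):
    """Fraction of categorical attributes that match between every pair of rows."""
    return np.mean(A[:, None, :] == B[None, :, :], axis=2)

rng = np.random.default_rng(2)
X_num = rng.normal(size=(100, 3))           # numeric features
X_cat = rng.integers(0, 3, size=(100, 2))   # categorical features
y = (X_num[:, 0] + (X_cat[:, 0] == 1) > 0.5).astype(int)

# Combined kernel: RBF on the numeric part plus matching on the categorical part.
K = rbf_kernel(X_num, gamma=0.5) + matching_kernel(X_cat, X_cat)

clf = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", clf.score(K, y))
```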

    Theoretical and practical considerations on the convergence properties of the Fisher-EM algorithm

    The Fisher-EM algorithm was recently proposed in (Bouveyron2011) for the simultaneous visualization and clustering of high-dimensional data. It is based on a latent mixture model which fits the data into a latent discriminative subspace with a low intrinsic dimension. Although the Fisher-EM algorithm is based on the EM algorithm, it does not at first glance satisfy all the conditions of the EM convergence theory. Its convergence toward a maximum of the likelihood is therefore questionable. The aim of this work is twofold. Firstly, the convergence of the Fisher-EM algorithm is studied from the theoretical point of view. In particular, it is proved that the algorithm converges under weak conditions in the general case. Secondly, the convergence of the Fisher-EM algorithm is considered from the practical point of view. It is shown that Fisher's criterion can be used as a stopping criterion for the algorithm to improve the clustering accuracy. It is also shown that the Fisher-EM algorithm converges faster than both the EM and CEM algorithms.
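
    A minimal numpy sketch of such a stopping rule, assuming the between-class and within-class scatter matrices and the current projection U are available from the clustering iterations; the tolerance value is an assumption.

```python
import numpy as np

def fisher_criterion(U, S_B, S_W):
    """Fisher's criterion trace((U' S_W U)^{-1} U' S_B U) for a projection U."""
    W = U.T @ S_W @ U
    B = U.T @ S_B @ U
    return np.trace(np.linalg.solve(W, B))

def has_converged(crit_history, tol=1e-4):
    """Stop when the relative change of Fisher's criterion falls below tol."""
    if len(crit_history) < 2:
        return False
    prev, curr = crit_history[-2], crit_history[-1]
    return abs(curr - prev) <= tol * max(abs(prev), 1e-12)
```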

    Exact Dimensionality Selection for Bayesian PCA

    We present a Bayesian model selection approach to estimate the intrinsic dimensionality of a high-dimensional dataset. To this end, we introduce a novel formulation of the probabilistic principal component analysis model based on a normal-gamma prior distribution. In this context, we exhibit a closed-form expression of the marginal likelihood which allows us to infer an optimal number of components. We also propose a heuristic based on the expected shape of the marginal likelihood curve in order to choose the hyperparameters. In non-asymptotic frameworks, we show on simulated data that this exact dimensionality selection approach is competitive with both Bayesian and frequentist state-of-the-art methods.
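
    The closed-form marginal likelihood of the paper is not reproduced here; as a hedged stand-in, the sketch below scores each candidate dimension with the maximized Gaussian PPCA log-likelihood (Tipping and Bishop's closed-form solution) plus a BIC penalty, which illustrates the same "pick the dimension that maximizes a model-selection criterion" logic with a different criterion.

```python
import numpy as np

def ppca_loglik(eigvals, q, n):
    """Maximized PPCA log-likelihood for intrinsic dimension q,
    using the eigenvalues of the sample covariance matrix (descending)."""
    d = len(eigvals)
    sigma2 = eigvals[q:].mean()   # ML estimate of the residual noise variance
    return -0.5 * n * (d * np.log(2 * np.pi)
                       + np.sum(np.log(eigvals[:q]))
                       + (d - q) * np.log(sigma2)
                       + d)

def select_dimension(X):
    """Pick the dimension minimizing a BIC score built on the PPCA likelihood."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    eigvals = np.sort(np.linalg.eigvalsh(np.cov(Xc, rowvar=False)))[::-1]
    best_q, best_bic = None, np.inf
    for q in range(1, d):
        n_params = d * q - q * (q - 1) // 2 + 1 + d   # loadings, noise variance, mean
        bic = -2 * ppca_loglik(eigvals, q, n) + n_params * np.log(n)
        if bic < best_bic:
            best_q, best_bic = q, bic
    return best_q

print(select_dimension(np.random.default_rng(4).normal(size=(200, 6))))
```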

    Adaptive mixtures of regressions: Improving predictive inference when population has changed

    The present work investigates the estimation of regression mixtures when the population has changed between the training and the prediction stages. Two approaches are proposed: a parametric approach modelling the relationship between the dependent variables of both populations, and a Bayesian approach in which the priors on the prediction population depend on the mixture regression parameters of the training population. The relevance of both approaches is illustrated on simulations and on an environmental dataset.

    Probabilistic Fisher discriminant analysis: A robust and flexible alternative to Fisher discriminant analysis

    Fisher discriminant analysis (FDA) is a popular and powerful method for dimensionality reduction and classification. Unfortunately, the optimality of the dimension reduction provided by FDA is only proved in the homoscedastic case. In addition, FDA is known to perform poorly in the presence of label noise and sparsely labeled data. To overcome these limitations, this work proposes a probabilistic framework for FDA which relaxes the homoscedastic assumption on the class covariance matrices and adds a term to explicitly model the non-discriminative information. This allows the proposed method to be robust to label noise and to be used in the semi-supervised context. Experiments on real-world datasets show that the proposed approach works at least as well as FDA in standard situations and outperforms it in the label noise and sparse label cases.

    On the estimation of the latent discriminative subspace in the Fisher-EM algorithm

    The Fisher-EM algorithm was recently proposed in [2] for the simultaneous visualization and clustering of high-dimensional data. It is based on a discriminative latent mixture model which fits the data into a latent discriminative subspace with an intrinsic dimension lower than the dimension of the original space. The Fisher-EM algorithm includes an F-step which estimates the projection matrix whose columns span the discriminative latent space. This matrix is estimated via an optimization problem which is solved using a Gram-Schmidt procedure in the original algorithm. Unfortunately, this procedure suffers in some cases from numerical instabilities which may deteriorate the visualization quality or the clustering accuracy. Two alternatives for estimating the latent subspace are proposed to overcome this limitation. The optimization problem of the F-step is first recast as a regression-type problem and then reformulated such that the solution can be approximated with an SVD. Experiments on simulated and real datasets show the improvement of the proposed alternatives for both the visualization and the clustering of data.
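
    As a hedged illustration of estimating a discriminative projection without a Gram-Schmidt procedure (not the exact regression reformulation of the paper), one can solve the generalized eigenvalue problem S_B u = lambda S_W u and orthonormalize the retained eigenvectors with an SVD; the ridge regularization is an assumption added for numerical stability.

```python
import numpy as np
from scipy.linalg import eigh

def discriminative_subspace(S_B, S_W, d, ridge=1e-6):
    """Return an orthonormal p x d basis of a discriminative subspace.

    Solves the generalized eigenproblem S_B u = lambda (S_W + ridge I) u,
    keeps the d leading eigenvectors and orthonormalizes them via an SVD.
    """
    p = S_W.shape[0]
    vals, vecs = eigh(S_B, S_W + ridge * np.eye(p))   # ascending eigenvalues
    U = vecs[:, np.argsort(vals)[::-1][:d]]           # d leading directions
    # Orthonormalization through a thin SVD (numerically stabler than Gram-Schmidt).
    left, _, right = np.linalg.svd(U, full_matrices=False)
    return left @ right
```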

    Adaptive Linear Models for Regression: improving prediction when population has changed

    The general setting of regression analysis is to identify a relationship between a response variable Y and one or several explanatory variables X by using a learning sample. In a prediction framework, the main assumption for predicting Y on a new sample of observations is that the regression model Y = f(X) + e is still valid. Unfortunately, this assumption is not always true in practice and the model may have changed. We therefore propose to adapt the original regression model to the new sample by estimating a transformation between the original regression function f(X) and the new one f*(X). The main interest of the proposed adaptive models is to allow a regression model for the new population to be built from only a small number of observations, using knowledge of the reference population. The efficiency of this strategy is illustrated by applications on artificial and real datasets, including the modeling of the housing market in different U.S. cities. A package for the R software dedicated to the adaptive linear models is available on the author's web page.
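
    A hedged sketch of the general idea (not the specific family of transformations studied in the paper): fit the reference model f on the large training sample, then estimate a simple affine link f*(x) ≈ a f(x) + b from the few observations available for the new population. The data-generating choices below are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

# Large reference sample: Y = f(X) + e.
beta = np.array([1.0, -2.0, 0.5, 3.0])
X_ref = rng.normal(size=(500, 4))
y_ref = X_ref @ beta + rng.normal(scale=0.3, size=500)
f = LinearRegression().fit(X_ref, y_ref)

# Small sample from the new population, where the relationship has drifted.
X_new = rng.normal(size=(15, 4))
y_new = 1.4 * (X_new @ beta) + 2.0 + rng.normal(scale=0.3, size=15)

# Adapt: regress y_new on the reference predictions to estimate a and b.
link = LinearRegression().fit(f.predict(X_new).reshape(-1, 1), y_new)
a, b = link.coef_[0], link.intercept_
print(f"estimated link: y* ~ {a:.2f} * f(x) + {b:.2f}")
```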